Abstract. The computational complexity of the "cluster minimization problem" is 
revisited [L. T. Wille and J. Vennik, J. Phys. A 18, L419 (1985)]. It is argued that the 
original NP-hardness proof does not apply to pairwise potentials of physical interest, 
such as those that depend on the geometric distance between the particles. A geometric 
analog of the original problem is formulated, and a new proof for such potentials is 
provided by polynomial time transformation from the independent set problem for unit 
disk graphs. Limitations of this formulation are pointed out, and new subproblems 
that bear more direct consequences to the numerical study of clusters are suggested. 
The present contribution addresses the inherent computational difficulty of finding 
the ground state configuration $\mathbf{R} = (\mathbf{r}_1, \ldots, \mathbf{r}_N)$ of a classical $N$-particle system with 
total potential energy 

$$U(\mathbf{R}) = \sum_{i=1}^{N} \sum_{j>i}^{N} u(|\mathbf{r}_j - \mathbf{r}_i|), \tag{1}$$

where each $\mathbf{r}_i \in \mathbb{R}^3$. Though fairly general in scope, this problem is of particular interest 
in the characterization of the energy landscapes of atomic clusters (see e.g. [1, 2]), and 
has important consequences for the efficiency of Monte Carlo sampling methods for 
clusters at low temperatures. 
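
As a concrete reading of Eq. (1), the following minimal Python sketch evaluates the total energy of a configuration by summing a pair potential over all unordered pairs of particles. The function names and the Lennard-Jones form used in the example are illustrative assumptions and are not taken from the original text; any function of the interparticle distance could be supplied instead.

```python
import itertools
import math


def total_energy(positions, u):
    """Sum u(|r_j - r_i|) over all unordered pairs, i.e. N(N-1)/2 terms."""
    return sum(
        u(math.dist(ri, rj))
        for ri, rj in itertools.combinations(positions, 2)
    )


def lennard_jones(r):
    """Illustrative pair potential in reduced units (an assumption)."""
    return 4.0 * (r ** -12 - r ** -6)


# Two particles separated by the Lennard-Jones equilibrium distance 2^(1/6).
dimer = [(0.0, 0.0, 0.0), (2 ** (1 / 6), 0.0, 0.0)]
energy = total_energy(dimer, lennard_jones)
```

Evaluating the energy of a given configuration is therefore cheap; the difficulty discussed in this paper lies in searching over configurations.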

When addressing such fundamental aspects of optimization problems, a primary 
goal is to establish whether the given problem is "NP-hard," a class of problems also 
known as intractable, since their solution in the worst-case scenario requires a number 
of computational steps that increases exponentially with the "size" of the problem 
(intractability hinges upon the validity of the P ≠ NP conjecture, which is nevertheless 
widely accepted, see [3, 4] for details, or [5] for a gentle introduction aimed at the physics 
audience). Establishing the membership of a problem in the NP-hard class is, on the 
one hand, of fundamental importance to the field of computational complexity, for a 
given polynomial time solution to the problem would collapse the entire NP class into 
P, and on the other hand of practical interest to the problem itself, for it indicates that 
the chances of finding a general-purpose algorithm with polynomial efficiency across 
all instances of the problem are very small, thereby encouraging the development of 
specialized algorithms that can benefit from whatever physical insight one has into 
the specific instance of the problem at hand. Proofs of NP-hardness are especially 
welcome in light of recent efforts to characterize the boundaries between exponential and 
polynomial behavior in "typical" instances of NP-complete problems (see e.g. [6, 7, 8]), 
an endeavor that has the potential of providing valuable guidance for dealing with NP- 
hard optimization problems. 

A study on the computational complexity of classical systems interacting via 
pairwise potentials has been reported in Ref. [9], a celebrated work that is frequently 
cited in the literature as providing a proof that the "cluster minimization problem" 
is NP-hard (see e.g. [10, 11, 12, 13, 14, 15]). It is part of the goal of the present 
work to argue that this proof is too general, and that a different proof is necessary for 
physically relevant potentials such as those described by Eq. (1) (i.e. those that depend 
on the geometric distance between the particles), and hence that the NP-hardness of the 
"physical" cluster minimization problem is still an open issue. The remainder of this 
paper is thus divided into three parts: a critical analysis of Ref. [9], a proof of NP-hardness 
valid for potentials of the form of Eq. (1), and a discussion of the limitations of this 
proof. 

One of the simplest ways of proving that a given problem $\pi$ is NP-hard is by 
"restriction," i.e. by showing that a special case of the problem $\pi$ coincides 
with a problem $\pi'$ already known to be NP-hard (much of the present discussion on 
computational complexity was adapted from [2]). Often, however, the connection 
between the problem of interest and a known intractable problem is not immediate; in 
this case, one can still establish the NP-hardness of $\pi$ by "polynomial transformation" 
from $\pi'$, i.e. by showing that every instance $I'$ of $\pi'$ can be transformed to an instance 
$I = f(I')$ of $\pi$ in polynomial time, and that a solution of $I$ also solves $I'$. The NP-
hardness proof of Wille and Vennik was based on such a transformation [9]: Formulating 
the cluster minimization problem in terms of "graphs" (see e.g. [4]), these authors 
considered what will be henceforth referred to as the WEIGHTED EDGE problem, showing 
that the TRAVELING SALESPERSON (TSP) problem can be transformed to WEIGHTED EDGE; 
since TSP is known to be NP-hard, so must be WEIGHTED EDGE. In the following, I will 
analyze the proof offered by these authors, arguing that it requires pair "potentials" 
that are far more general than those of physical interest, which leaves the possibility 
that the case of more restrictive and physically relevant potentials may be treated by a 
polynomial-time algorithm. 

First, let us state Wille and Vennik's problem following the standard format of 
Garey and Johnson [3]: 

[WEIGHTED EDGE] Instance: A complete graph $G = (V, E)$, a weight function 
$w(e)$ for each $e \in E$, and a number $N < |V|$. Problem: Find a subgraph 
$G' = (V', E')$ of size $|V'| = N$ such that 

$$\sum_{e \in E'} w(e) \quad \text{is minimal.}$$

Here a complete graph $G = (V, E)$, where $V$ are the vertices and $E$ are the edges 
connecting the vertices, is one in which every pair of vertices is connected by an edge 
(i.e. a "clique"), and the notation $|V|$ indicates the number of vertices. An illustration 
of this problem is provided in Fig. 1. 
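
To make the statement of WEIGHTED EDGE concrete, here is a minimal brute-force sketch in Python. It simply enumerates every subset of $N$ vertices and sums the weights of the edges among them; it is an illustration under stated assumptions (the function name, the toy weights and the dictionary representation of $w$ are all hypothetical), not the method of any of the cited works, and its running time grows combinatorially with the number of vertices.

```python
from itertools import combinations


def weighted_edge_brute_force(vertices, w, n):
    """Return (subset, weight) minimizing the summed edge weights of n vertices."""
    best_subset, best_weight = None, float("inf")
    for subset in combinations(vertices, n):
        # Sum w over all edges of the complete subgraph induced by the subset.
        weight = sum(w(a, b) for a, b in combinations(subset, 2))
        if weight < best_weight:
            best_subset, best_weight = subset, weight
    return best_subset, best_weight


# Hypothetical toy instance: four vertices with arbitrary symmetric weights.
toy_weights = {
    frozenset("ab"): 5, frozenset("ac"): 1, frozenset("ad"): 1,
    frozenset("bc"): 4, frozenset("bd"): 3, frozenset("cd"): 1,
}


def toy_w(a, b):
    return toy_weights[frozenset((a, b))]


best, weight = weighted_edge_brute_force("abcd", toy_w, 3)
```

Note that the weights in this toy instance are assigned pair by pair with no geometric structure, which is exactly the generality discussed in the text.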

When formulating the cluster minimization problem as WEIGHTED EDGE, the authors 
of Ref. [9] had in mind the following analogy: $V$ is the set of "sites" that the $N$ particles 
are allowed to occupy, and for every pair of sites with edge $e = (v_i, v_j)$, the function 
$w(e) = w(v_i, v_j)$ measures the strength of the interaction between two hypothetical 
particles placed in these sites. The set V is of course discrete, reflecting the discretization 
of the problem that accompanies any computational method, but notice that no specific 
structure is given to either V or w(e) (this arbitrariness will be discussed later in the 
text). 

As already mentioned, the proof that WEIGHTED EDGE is NP-hard was provided 
in Ref. [9] by showing that it "contains" TSP, more precisely by showing that there 
exists a weight function $w(e)$, a choice of $|V|$ as a function of $N$, and a labeling of the 
vertices in terms of pairs of "cities," such that a solution of WEIGHTED EDGE would yield 
a minimum TSP tour.† Though not stated in [9], it is easy to check that this labeling and 

† A more direct proof that WEIGHTED EDGE is NP-hard can be given. Indeed, given any graph 
$G^* = (V^*, E^*)$, construct a complete graph $G = (V^*, E)$, setting $w(v_i^*, v_j^*) = 1$ if there is a 
corresponding $(v_i^*, v_j^*) \in E^*$, and $w(v_i^*, v_j^*) = 2$ otherwise. The solution of WEIGHTED EDGE for the 
Figure 1. Illustration of the WEIGHTED EDGE problem, with vertices represented by 
circles and edges by dotted lines. The problem is to choose N out of all the vertices 
such that the sum of the corresponding edge weights w(e) is minimal. Although the 
entire graph is complete, for clarity of illustration only edges connecting the vertices 
of a hypothetical solution with N = 5 are shown. 



choice of w(e) can be done in polynomial time given any instance of TSP. One additional 
observation not explicitly mentioned by Wille and Vennik is that, since the NP-hardness 
of TSP means that worst-case solutions to the problem are found in time exponential 
in the number of cities, their result implies only that worst-case solutions to WEIGHTED 
EDGE are found in time exponential in the number of sites, not in the number of particles 
N as one would like to show (recall that the general instance of WEIGHTED EDGE does 
not tie $|V|$ to $N$). Nonetheless, it stands to reason that any algorithm relevant to the 
cluster minimization problem will adjust the size of $V$ so as to ensure comparable spatial 
resolutions for clusters of different sizes, and a reasonable choice would be a polynomial 
in $N$, i.e. $|V| \propto N^p$ for some integer $p > 0$. Thus, the NP-hardness of WEIGHTED EDGE 
with this additional property says that the completion time of any algorithm that solves 
the problem will indeed scale exponentially with the number of particles N. 
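
For a rough sense of scale, the sketch below counts the candidate subsets that a naive exhaustive search would have to examine when the number of sites is tied to the number of particles as $|V| = cN^p$. The values of $c$ and $p$ are arbitrary assumptions chosen for illustration. The count describes only brute-force enumeration; the NP-hardness statement in the text concerns every possible algorithm and is conditional on P ≠ NP.

```python
from math import comb


def candidate_count(n_particles, p=2, c=1):
    """Number of ways to choose n_particles sites out of c * n_particles**p."""
    n_sites = c * n_particles ** p
    return comb(n_sites, n_particles)


# With |V| = N^2, the search space grows very quickly with N.
counts = [candidate_count(n) for n in (2, 4, 8)]
```

Already for eight particles on sixty-four sites the naive search space exceeds four billion subsets.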

Finally, observe that the choice of w(e) that allows the connection with TSP (Eq. (4) 
in Ref. [9]) is too general to reflect physical problems of interest; in particular, it requires 
the possibility that the "potential" between two sites depends on each individual site 
in question, and not just on their relative geometric distance. Although the authors 
state that their proof remains true for spherically symmetric potentials "by comparing 
it to the general and the Euclidean traveling salesperson problem" [9], this is far from 
clear: It is difficult to imagine a polynomial time transformation from a generic instance 
of the (Euclidean or not) TSP problem to an instance of the geometric WEIGHTED EDGE 

graph thus constructed has total weight equal to $N(N-1)/2$ if and only if $G^*$ contains a clique of 
size $N$, and the foregoing transformation can be done in polynomial time. But determining if a graph 
contains a clique of a given size is known to be an NP-complete problem (CLIQUE, see [3]), and thus 
WEIGHTED EDGE must be NP-hard, q.e.d. 
problem, where the vertices are points in the Euclidean space and the weights $w(e)$ depend 
only on geometric distances (see CLUSTER MINIMIZATION below). The NP-hardness proof for the Euclidean 
TSP that the authors refer to is quite elaborate, and requires methods other than simple 
restriction [16], and it is also not clear how this proof generalizes to their problem. In 
the following, an explicit proof for potentials that depend on the geometric distance 
between the particles will be provided. 
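
The more direct transformation from CLIQUE described in the footnote can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and the WEIGHTED EDGE instance is solved by exhaustive enumeration purely so that the example is self-contained. Edges of the input graph receive weight 1, all other pairs weight 2, and a clique of size $N$ exists exactly when the minimal total weight equals $N(N-1)/2$.

```python
from itertools import combinations


def clique_to_weight_function(edges):
    """Weight 1 for pairs joined by an edge of the input graph, 2 otherwise."""
    edge_set = {frozenset(e) for e in edges}

    def w(a, b):
        return 1 if frozenset((a, b)) in edge_set else 2

    return w


def minimal_total_weight(vertices, w, n):
    """Exhaustive solution of WEIGHTED EDGE (for illustration only)."""
    return min(
        sum(w(a, b) for a, b in combinations(subset, 2))
        for subset in combinations(vertices, n)
    )


def has_clique(vertices, edges, n):
    """Decide CLIQUE through the corresponding WEIGHTED EDGE instance."""
    w = clique_to_weight_function(edges)
    return minimal_total_weight(vertices, w, n) == n * (n - 1) // 2


# Hypothetical input: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
graph_vertices = [1, 2, 3, 4]
graph_edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
```

The construction of the weight function takes time polynomial in the size of the graph; only the exhaustive solver used here for checking is expensive.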

The problem of minimizing the energy of N particles with spherically symmetric 
pairwise interaction can be stated as: 

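[CLUSTER MINIMIZATION] Instance: A set of sites $S$ in $\mathbb{R}^3$, a number of 
particles $N < |S|$, and a potential function $u(r)$. Problem: Find a subset 
$S' = \{\mathbf{r}_1, \ldots, \mathbf{r}_N\}$ of $S$ with size $|S'| = N$, such that 

$$\sum_{i=1}^{N} \sum_{j>i}^{N} u(|\mathbf{r}_j - \mathbf{r}_i|) \quad \text{is minimal.}$$
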
Before we proceed, note the crucial difference with respect to WEIGHTED EDGE: In 
CLUSTER MINIMIZATION, it is the relative geometric position of the sites that determines 
the strength of the interaction potential. The proof of Ref. [9] for the WEIGHTED EDGE 
problem does not apply here, and likewise there are several subproblems of WEIGHTED 
EDGE not amenable to that proof which are nevertheless NP-hard by restriction to 
CLUSTER MINIMIZATION (e.g. the case where the interaction depends on the vectorial 
distance, $u = u(\mathbf{r}_j - \mathbf{r}_i)$, can clearly be restricted to spherically symmetric potentials). 

It will now be shown that the unit disk graph (UDG) problem UDG INDEPENDENT 
SET, a decision problem known to be NP-complete [17], can be transformed to CLUSTER 
MINIMIZATION in polynomial time. The problem can be stated as: 

[UDG INDEPENDENT SET] Instance: A set $D$ of disks of unit radius in $\mathbb{R}^2$, and 
an integer $K < |D|$. Problem: Does $D$ contain an independent set of size $K$, 
i.e. a subset $D'$ of $D$ with $K$ disks, no two of which overlap with each 
other? 

The problem UDG INDEPENDENT SET is one among various graph-theoretic problems that 
remain intractable when the graphs are restricted to be of the unit disk type, i.e. when 
vertices can be represented by disk centers, and edges by overlaps between the corresponding 
disks (see illustration in Fig. 2). The problem of determining 
whether an arbitrary graph can be represented by a unit disk graph is also NP-hard 
[18], and so far the only interesting problem in this context that allows a polynomial 
time solution seems to be UDG CLIQUE [17]. 
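
The passage from a set of disks to its unit disk graph can be sketched in a few lines of Python. The function name is hypothetical, and the convention that tangent disks count as overlapping (center distance at most twice the radius) is an assumption made here for definiteness.

```python
from itertools import combinations
from math import dist


def unit_disk_graph(centers, radius=1.0):
    """Return the edge set {(i, j)} of disks whose centers lie within 2 * radius."""
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(centers), 2)
        if dist(a, b) <= 2 * radius
    }


# Hypothetical example: three disks in a row plus one far away.
example_centers = [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0), (10.0, 0.0)]
example_edges = unit_disk_graph(example_centers)
```

Building the graph from the disk centers takes a number of steps quadratic in the number of disks; the hard direction mentioned in the text is the reverse one, namely deciding whether a given abstract graph admits such a representation.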

The proof proceeds by choosing the sites S of CLUSTER MINIMIZATION to coincide 
with the center of the disks D of a given instance of UDG INDEPENDENT SET (this can 
clearly be done in polynomial time, and effectively places the sites in a plane), and 
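setting $N = K$ and choosing a step potential that assigns unit energy to every pair of sites whose disks do not overlap and a strictly larger energy to every pair whose disks overlap, e.g. 

$$u(r) = \begin{cases} 2 & \text{if } r \le 2, \\ 1 & \text{if } r > 2. \end{cases}$$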
Figure 2. Illustration of a unit disk graph. Left: a graph drawn in the "standard" 
format; Right: the corresponding UDG. Note that the problem of determining whether 
an arbitrary graph can be represented as a UDG is itself NP-hard [18]. 



Therefore, this instance of CLUSTER MINIMIZATION has a solution with total energy 
equal to $N(N-1)/2$ if and only if $D$ has an independent set with $N$ disks. In other 
words, an answer to UDG INDEPENDENT SET is immediately available from a solution of 
the corresponding CLUSTER MINIMIZATION problem (recall that the evaluation of the 
energy of an $N$-particle system interacting with pairwise potentials can be done in 
polynomial time, more explicitly in $O(N^2)$ steps). Thus, since UDG INDEPENDENT SET 
is NP-complete, it must be that CLUSTER MINIMIZATION is an NP-hard problem, q.e.d. 
(Note the similarity of this argument to that used to show that HAMILTONIAN CIRCUIT 
can be transformed to TSP in [3], pp. 35-36.) 
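
The transformation used in the proof can be illustrated with the following Python sketch. It rests on explicit assumptions: the pair potential is taken to be a step function equal to 1 for non-overlapping disks (center distance greater than 2) and 2 otherwise, tangent disks are counted as overlapping, and the CLUSTER MINIMIZATION instance is solved by exhaustive enumeration only to keep the example self-contained. All function names are hypothetical.

```python
from itertools import combinations
from math import dist


def step_potential(r, overlap_distance=2.0):
    """Assumed pair potential: 2 if the unit disks overlap, 1 otherwise."""
    return 2 if r <= overlap_distance else 1


def minimal_cluster_energy(sites, n, u):
    """Exhaustive solution of CLUSTER MINIMIZATION (for illustration only)."""
    return min(
        sum(u(dist(a, b)) for a, b in combinations(subset, 2))
        for subset in combinations(sites, n)
    )


def has_independent_set(disk_centers, k):
    """Answer UDG INDEPENDENT SET from the minimal cluster energy."""
    # Place the sites at the disk centers, embedded in the plane z = 0.
    sites = [(x, y, 0.0) for x, y in disk_centers]
    # An independent set of size k exists iff every chosen pair contributes 1.
    return minimal_cluster_energy(sites, k, step_potential) == k * (k - 1) // 2


# Hypothetical instance: four unit disks in a row, neighbors overlapping.
row_of_disks = [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0), (4.5, 0.0)]
```

In this instance the disks form a path, so two mutually non-overlapping disks can be selected but three cannot, which is what the energy criterion reports.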

The above formulation of the cluster minimization problem is subject to some 
caveats. As already discussed in the case of WEIGHTED EDGE, the assumption that the 
total number of sites $|S|$ grows with the number of particles $N$ is required in order to have 
an intractability result in terms of $N$. A more subtle limitation of the present formulation 
can be appreciated by stating the foregoing result in words: Given an arbitrary set of sites 
in $\mathbb{R}^3$ and a spherically symmetric but otherwise arbitrary pair-potential energy function 
$u(r)$, no algorithm can find the minimum energy configuration of the system in time 
bounded by a polynomial in the number of particles $N$. (As usual, this result hinges upon 
the P ≠ NP conjecture; see the introductory paragraphs above.) This is perhaps not the 
most relevant result for the cluster minimization problem, since in practice one seldom 
uses a discretization different from the simple cubic lattice (this happens e.g. when 
the components of r are represented by finite-precision, binary numbers). Ideally, one 
would like to show that for any "reasonable" discrete representation of the continuum 
(see e.g. [19] for a formalization and examples of this concept), the arbitrariness in the 
choice of u(r) by itself is sufficient to make the problem NP-hard, i.e. that for any such 
lattice, there are choices of u(r) leading to intractable solutions. The foregoing proof of 
course fails when the sites are arranged in a simple cubic structure, for the existence of 
an independent set of a given size in this case is straightforwardly determined by the 
radius of the disks and the lattice spacing. In fact, it is known that a polynomial time 
algorithm exists for finding independent sets in unit disk graphs restricted to "grid" 
structures. The NP-hardness of the fixed-lattice subproblem seems considerably more difficult 
to prove than that of the more general CLUSTER MINIMIZATION problem introduced in this 
study, and would certainly be a welcome result in the literature. 

In summary, a critique of the celebrated NP-hardness proof of the cluster 
minimization problem [9] was offered, showing that a separate proof is necessary when 
the pairwise potential is a function of the geometric distance between the particles only. 
Using a geometric generalization of the formulation adopted in Ref. [9], an independent 
proof for such cases was presented by transformation from the independent set problem 
for unit disk graphs [17]. Limitations of this new proof were discussed in the last 
paragraph, and a subproblem that fixes the lattice structure was suggested as being 
more relevant to traditional numerical implementations of the cluster minimization 
problem. To this author's knowledge, however, the intractability of this subproblem 
remains unknown. 